- Evolution of CNNs
- Started in the 1940s, died out in the 1960s; picked up again in the mid-1980s
- Frank Rosenblatt built a physical model called a perceptron.
- Fukushima built the Neocognitron, a neural net modeled on how the visual cortex processes images.
- Neurons are replicated across the visual field
- Shifting the input changes which simple cells are activated
- Complex cells pool the information from simple cells
- Orientation-selective units
- Neural nets were deployed on Android for speech recognition in 2012.
- From vision
- Neurons in front of the retina's photoreceptors compress the data; their axons exit through the blind spot to form the optic nerve
- Invertebrates have it the right way around: their retinas are not inverted, so there is no blind spot
- Essentially a feed-forward process
- Retinotopic – neurons are arranged to mirror the layout of the visual field
- These neurons also show some orientation selectivity
- Supervised learning
- Train a machine by showing it examples instead of programming it explicitly
- Classical ML
- input -> feature extractor -> trainable classifier (see the sketch below)
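A minimal sketch of the classical pipeline, with hypothetical hand-crafted features (mean, variance, gradient energy — my own illustrative choices) and toy data; only the linear classifier at the end is trained:

```python
# Classical ML: fixed, hand-designed feature extractor -> trainable linear classifier.
import numpy as np
from sklearn.linear_model import LogisticRegression

def extract_features(img):
    """Hand-crafted features for a 2-D grayscale image (H x W array)."""
    gy, gx = np.gradient(img.astype(float))
    return np.array([img.mean(), img.var(), np.abs(gx).mean(), np.abs(gy).mean()])

# Toy data: "smooth" images vs. "noisy" images, labelled 0 / 1.
rng = np.random.default_rng(0)
smooth = [np.outer(np.linspace(0, 1, 16), np.linspace(0, 1, 16))
          + 0.05 * rng.standard_normal((16, 16)) for _ in range(50)]
noisy = [rng.standard_normal((16, 16)) for _ in range(50)]

X = np.stack([extract_features(im) for im in smooth + noisy])  # feature extractor: not trained
y = np.array([0] * 50 + [1] * 50)

clf = LogisticRegression().fit(X, y)                           # only this part is trained
print("train accuracy:", clf.score(X, y))
```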
- Deep learning
- ResNet-50 is the current workhorse for image recognition (sketch below)
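A minimal sketch (assumes torchvision is installed; weights are left random so nothing is downloaded) showing ResNet-50 as a module mapping a 224×224 RGB batch to 1000 ImageNet class scores:

```python
import torch
from torchvision import models

# ResNet-50 for image recognition: 224x224 RGB batch in, 1000 ImageNet logits out.
model = models.resnet50()            # random weights; load pretrained weights for real use
model.eval()

x = torch.randn(1, 3, 224, 224)      # dummy image batch
with torch.no_grad():
    logits = model(x)
print(logits.shape)                  # torch.Size([1, 1000])
```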
- Manifold Hypothesis
- Natural data lives on a low-dimensional manifold
- e.g. natural images are a tiny subset of all possible images (see the sketch below)
- Natural data is compositional
- it is efficiently representable hierarchically
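A minimal numeric sketch of the manifold idea (a toy construction of my own, not from the lecture): data generated from a single latent angle but embedded in 100 ambient dimensions occupies only a tiny slice of that space, which the singular values reveal:

```python
import numpy as np

# Ambient dimension is 100, but the data is generated from one latent angle,
# so it lies on a 1-D manifold (a closed curve) inside R^100.
rng = np.random.default_rng(0)
theta = rng.uniform(0, 2 * np.pi, size=1000)               # 1 latent factor
latent = np.stack([np.cos(theta), np.sin(theta)], axis=1)  # circle in R^2
A = rng.standard_normal((2, 100))                          # fixed embedding into R^100
X = latent @ A                                             # (1000, 100) "high-dimensional" data

# Only ~2 singular values are non-negligible: the curve spans a 2-D linear
# subspace, and its intrinsic dimension (one angle) is lower still.
s = np.linalg.svd(X - X.mean(0), compute_uv=False)
print(np.round(s[:5], 2))
```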
- Why does DL work?
- why does it need so many layers
- no guarantees about convergence
- why is it so over-parametrized
- Generic feature extraction
- expand the dimension of the representation so that the data becomes linearly separable (see the sketch after this list)
- space tiling / random projections / polynomial classifier / radial basis functions / kernel machines
- these are essentially 2-layer architectures: feature expansion followed by a linear classifier
- SVM
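A minimal sketch of the dimension-expansion idea, using the standard XOR-style example (my choice of data, with a polynomial expansion, one of the generic methods listed above):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# XOR-style data: label depends on the sign of x1 * x2, not linearly separable in R^2.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(400, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(int)

linear = LogisticRegression().fit(X, y)
print("raw 2-D features:     ", linear.score(X, y))       # near chance level (~0.5)

# Polynomial feature expansion as the generic "first layer":
# map (x1, x2) -> (x1, x2, x1^2, x2^2, x1*x2); now a linear classifier suffices.
Phi = np.column_stack([X, X[:, 0] ** 2, X[:, 1] ** 2, X[:, 0] * X[:, 1]])
expanded = LogisticRegression().fit(Phi, y)
print("expanded 5-D features:", expanded.score(Phi, y))   # close to 1.0
```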
- Efficient parametrization of the class of functions is very important for AI tasks
- Scaling learning algorithms towards AI - Bengio & LeCun
- Trade off complexity/memory for computation (roughly)
- Exchanging time and space
- Deep models
- Ideally can extract each factor of variation
- Disentangling independent explanatory factors of variation
- Ultimate goal of representation learning
- Dimensions of the transformation matrix
- Height = output size
- Width = input size
- The practicum is an excellent way to build intuition about how matrices transform vectors (see the sketch below).
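A minimal sketch of that convention, assuming PyTorch as the framework: a matrix of shape (output size, input size) maps an input vector of length `input size` to an output vector of length `output size`, and `nn.Linear` stores its weight the same way:

```python
import torch
from torch import nn

in_size, out_size = 4, 3

# Plain matrix-vector view: A has height = output size, width = input size.
A = torch.randn(out_size, in_size)      # shape (3, 4)
x = torch.randn(in_size)                # input in R^4
y = A @ x                               # output in R^3
print(A.shape, x.shape, y.shape)

# nn.Linear follows the same convention: weight is (out_features, in_features).
layer = nn.Linear(in_size, out_size)
print(layer.weight.shape)               # torch.Size([3, 4])
```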